Towards Memory Prefetching with Neural Networks: Challenges and Insights
Authors
Abstract
Accurate memory prefetching is paramount for processor performance, and modern processors employ various techniques to identify and prefetch different memory access patterns. While most modern prefetchers target spatio-temporal patterns by matching memory addresses that are accessed in close proximity (either in space or time), the recently proposed concept of semantic locality views locality as an artifact of the algorithmic level and searches for correlations between memory accesses and program state. Although this approach has been shown to be effective, capturing semantic locality requires significant associative learning capabilities. In this paper, we use neural networks for this task. Artificial neural networks are becoming increasingly effective at pattern recognition and associative learning of complex relations. We leverage recent advances in this field to propose a conceptual neural network prefetcher. We show that by targeting semantic locality, this prefetcher can learn distinct memory access patterns that cannot be covered by other state-of-the-art prefetchers. We evaluate the neural network prefetcher over SPEC2006, Graph500, and a variety of handwritten kernels. We show that the prefetcher can deliver an average speedup of 22% for SPEC2006 (up to 90%) and up to 5× on the kernels. We also explore the limitations of using neural networks for prefetching. Ultimately, we conclude that although many challenges remain before a feasible, power-efficient implementation can be reached, the neural network prefetcher's potential gains over state-of-the-art prefetchers justify further exploration.
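The contrast between spatio-temporal matching and correlating accesses with program state can be illustrated with a toy example. The sketch below is not the prefetcher proposed in the paper; it is a minimal illustration under stated assumptions: a synthetic trace in which a hypothetical program-state value fully determines the next address delta, a last-delta (stride-style) baseline, and a single-layer perceptron over a one-hot encoding of that state. All names (`make_trace`, `StridePredictor`, `PerceptronPredictor`) and the delta values are invented for illustration.

```python
import random

# Hypothetical address deltas, one per program-state value (an assumption).
DELTAS = [64, -128, 256]


def make_trace(n, seed=0):
    """Return (state, delta) pairs in which the state determines the delta."""
    rng = random.Random(seed)
    trace = []
    for _ in range(n):
        state = rng.randrange(len(DELTAS))
        trace.append((state, DELTAS[state]))
    return trace


class StridePredictor:
    """Baseline: predict that the next delta repeats the last one observed."""

    def __init__(self):
        self.last_delta = 0

    def predict(self, state):
        return self.last_delta  # ignores program state entirely

    def update(self, state, delta):
        self.last_delta = delta


class PerceptronPredictor:
    """Single-layer perceptron over a one-hot encoding of the program state."""

    def __init__(self, n_states, deltas):
        self.deltas = deltas
        # weights[s][k]: score of delta class k when state s is active.
        self.weights = [[0] * len(deltas) for _ in range(n_states)]

    def _best_class(self, state):
        scores = self.weights[state]
        return max(range(len(scores)), key=scores.__getitem__)

    def predict(self, state):
        return self.deltas[self._best_class(state)]

    def update(self, state, delta):
        predicted = self._best_class(state)
        actual = self.deltas.index(delta)
        if predicted != actual:  # perceptron rule: adjust only on mistakes
            self.weights[state][actual] += 1
            self.weights[state][predicted] -= 1


def accuracy(predictor, trace):
    """Fraction of accesses whose delta was predicted before it was observed."""
    hits = 0
    for state, delta in trace:
        hits += predictor.predict(state) == delta
        predictor.update(state, delta)
    return hits / len(trace)


trace = make_trace(1000)
stride_acc = accuracy(StridePredictor(), trace)
neural_acc = accuracy(PerceptronPredictor(len(DELTAS), DELTAS), trace)
print(f"stride: {stride_acc:.2f}, perceptron: {neural_acc:.2f}")
```

Because the state values arrive in random order, the last-delta baseline is right only when the same state happens to repeat, while the state-correlating learner makes at most one mistake per state value. A real design would face far noisier state, a much larger context, and the hardware cost constraints the abstract mentions.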
Similar resources
Learning Memory Access Patterns
The explosion in workload complexity and the recent slow-down in Moore’s law scaling call for new approaches towards efficient computing. Researchers are now beginning to use recent advances in machine learning in software optimizations, augmenting or replacing traditional heuristics and data structures. However, the space of machine learning for computer hardware architecture is only lightly e...
Additional Insights Into Problem Definition and Positioning From Social Science; Comment on “Four Challenges That Global Health Networks Face”
Commenting on a recent editorial in this journal which presented four challenges global health networks will have to tackle to be effective, this essay discusses why this type of analysis is important for global health scholars and practitioners, and why it is worth understanding and critically engaging with the complexities behind these challenges. Focusing on the topics of problem definition ...
Performance Evaluation of Neural Network Prediction for Data Prefetching in Embedded Applications
Embedded systems need to respect stringent real-time constraints. Various hardware components in such systems, such as cache memories, exhibit variability and therefore affect execution time. Indeed, a cache memory access from an embedded microprocessor might result in a cache hit, where the data is available, or a cache miss, where the data needs to be fetched with an additional delay from an...
Integration of remote sensing and meteorological data to predict flooding time using deep learning algorithm
Accurate flood forecasting is vital for reducing flood risk. Due to the complicated structure of floods and river flow, this problem is difficult to solve. Artificial neural networks, such as recurrent neural networks, offer good performance on time series data. In recent years, the use of Long Short Term Memory networks has attracted much attention due to the faults of recurrent ne...
Prefetching Challenges in Distributed Memories for CMPs
Prefetch engines working on distributed memory systems behave independently by analyzing the memory accesses that are addressed to the attached piece of cache. They potentially generate prefetching requests targeted at any other tile on the system that depends on the computed address. This distributed behavior involves several challenges that are not present when the cache is unified. In this p...